Saturday, December 17, 2005

Not my bug

Often most debugging time is spent of pretty uninteresting bugs.

For example, most part of last Friday I was staring into debugger backed up by MIPS simulator, trying to figure out why exactly one test of 10 run decided to crash on a MIPS board. Let me first show you the code that was found guilty in the end:


static void itoa_really(unsigned long long value, unsigned int base)
{
    static char digits[] = {'0', '1', '2', '3', '4', '5',
                '6', '7', '8', '9', 'a', 'b', 'c', 
                'd', 'e', 'f'};

    if (!value)
        return;
    else {
        long rem;
        itoa_really(value / base, base);
        rem = value % base;
        target_putchar(digits[rem]);
    }
}
This is extremely basic code, but yes, it did broke everything.

Initially, the crash happened on a real MIPS board, on a pretty complex test. It was reduced to a test printing just two 64-bit values, and crash depended on the printed values. Say, printing 0 worked, but printing 0x11223344556677 crashed. It was not possible to debug the problem on the board, and the few floating licences for simulator were in use, so I tried to solve the problem by poking around. At one time I've added an extra line of code before call to "target_putchar", and the line was:


rem = 3;
Strangely, the crash was gone.

At this point I though that maybe there's some really wrong with 64 on 32 division, and 'rem' can become wrong, so I replace that "rem = 3" line with:


if (rem < 0) { rem = 0; }
if (rem > 15) { rem = 15; }
The crash appeared again.

Another try was:


if (rem < 3) { rem = 3; }
if (rem > 3) { rem = 3; }
and the crash was still there.

Finally, with:


if (rem <= 3) { rem = 3; }
if (rem > 3) { rem = 3; }
everything worked. Of course, all values was printed as "3333.....333", but it did not crash. Completely buffled, I decided simulator is the only help and eventually we've grabbed the license.

A couple of hours later, I found the one-character fix (literally). The linker config file specified mere 1KB of stack size, and when printing too large values, itoa_really recursed too deep, overflowing the stack, and overwriting a bunch of program variables. Changing to 4KB of stack set it all right. And what's even more funny, I never ever touched that linker config file, and have no idea what kind of programs are supposed to work on such stack space diet.

Tuesday, December 13, 2005

Whatever you name them

I've just wrote the following code:
void on_vector_write(unsigned address, unsigned address)
    {
        if (w_on_vector_write)
            w_on_vector_write(address,
    }
It's not complete -- I was just about to pass a second parameter to "w_on_vector_write" when I realized that the two parameters of my function have the same name. But heh, I've just copy-pasted that signature, and the code I've pasted from certainly compiles. Here's how that compiling code looks:
boost::function<void (unsigned address,unsigned address)> 
    on_vector_write;
Apparently, gcc does not care about parameter names that are used when forming function type. And it does not care about function names in function prototypes either, even in 4.0.